---
title: 'Are commitment contracts truly ineffective in augmenting health behaviors? A closer examination of Bai et al (2020)^[Access to replication dataset and files at @DVN/761OT1_2020. Access: https://doi.org/10.7910/DVN/761OT1]'
author: "Mike Eber, Han Zhang, Kai Shen Lim"
date: "`r Sys.Date()`"
abstract: "We replicate and extend the results of @bai_2020, which examines the effect of commitment contracts on attendance at preventive health care visits and health outcomes among hypertensive patients in India. @bai_2020 report that commitment contracts, which are designed to remedy self-control problems, can sometimes be welfare-harming when the people who sign up for contracts fail to follow through with their plans. The basis of identification in @bai_2020's paper is the randomization of patients into six different experimental arms that offer commitment contracts and/or discounts to patients. We start by examining the potential failures of the randomization process, and we find imbalance in observable covariates for both the commitment contract treatments and the discount treatments. We also explore different model specifications and distributional assumptions to see if alternative methods lead to different conclusions or interpretations. Our results support the findings of @bai_2020, suggesting that in this setting, commitment contracts do not appear to have beneficial effects on preventive health behaviors or health outcomes. However, we find weak evidence that the effects of commitment contracts are heterogeneous across population subgroups, suggesting that the benefits of commitment contracts are likely to be more nuanced. "
output: 
  pdf_document:
    number_sections: true
header-includes:
   - \usepackage{dcolumn}
   - \usepackage{float}  
   - \usepackage{setspace}\doublespacing   
   - \usepackage{rotating}
bibliography: R_30.11.2020.bib
indent: true    
---

\pagebreak

# Introduction

Commitment contracts have been proposed as a solution to self-control problems; these problems arise when an individual has different preferences for how to act now compared with in the future [@bond2018commitment]. For example, individuals may wish to save for retirement or exercise regularly, but they may not accomplish their goals due to biases such as present bias, which may make them focus on actions that have short-term benefits rather than prioritizing investments that have returns farther in the future.  In the domain of health behavior, commitment contracts have been noted as a promising area for policymakers, but they have not been demonstrated to have consistently beneficial effects [@halpern2012]. Reporting on the results of a randomized trial in India, @bai_2020 did not find that commitment contracts were associated with substantially greater rates of use of preventive health services.

Existing literature have demonstrated that people have the demand for commitment contracts across a variety of domains, including savings [@ashraf_tying_2005; @dupas_why_2013], effort tasks [@exley_observability_2016], and health [@gine_put_2010; @royer_incentives_2015, Halpern et al 2015, @schilbach_alcohol_2019; @toussaert_eliciting_2018]. While some studies show that people exhibited significant demand for commitment in savings and effort tasks [@dupas_why_2013; @exley_observability_2016], most of the studies in health reported a much smaller demand for commitment to healthy behavior in terms of the magnitude [@gine_put_2010; @royer_incentives_2015; @halpern_randomized_2015], except for special sub-groups such as rickshaw drivers [@schilbach_alcohol_2019], chocolate-eating children [@alan_patience_2015], or university faculty [@toussaert_eliciting_2018].  

For the study we replicated, @bai_2020 reported a relatively high take-up rate of 39% for personalized and subsidized contracts and a rate of 14% for unsubsidized contracts among hypertensive patients. However, they find that the relatively high take-up rate in some treatment arms didn’t translate into subsequent behavioral changes on doctor visits and health outcomes. A variety of behavioral economics literature has provided explanations that may shed light on this seemingly contradictory phenomenon. @carrera_who_2019 provided an explanation from the angel of stochastic valuation errors of contract value and perceived social pressure. Specifically, they argued that people can be biased by imperfect perception of the contract value, falsely trust that the authority offers something valuable, or feel pressure to not take up the offers from experimenter at the point of sign-up. From a theoretical perspective, @heidhues_futile_2009 predicts that commitment contracts do not work well for some agents as they are overconfident about their future self-control and cannot follow through the commitment despite of take-up. While @bai_2020's paper cannot not fully rule out the first explanation, they developed the main hypothesis based on the second explanation.

In the following section, we replicate that main results from @bai_2020, focusing specifically on the reduced form estimations. We then proceed by outlining the extensions we explored using the original authors' dataset. This included a balance check across treatment and control arms and a matching exercise which results from our findings that there might be potential imbalance across treatment and control arm. We also explored alternative models to abate concerns with model dependency, and finally a subgroup analysis. 

# Replication results

This section reproduces @bai_2020's paper on "Self-Control and Demand for Preventive Health: Evidence from Hypertension in India". We chose to replicate Tables 1, 2 (Panel A and B), and 3 (Panel A and B) from the paper, corresponding to Tables 1-5 in our results. These tables correspond to the summary of descriptive statistics and reduced-form regressions showing the effect of the experimental interventions on three outcomes (observed take-up of commitment contracts, any observed preventive health visit, and proportion of preventive health visits attended). We chose to reproduce these results because they represent the core findings of the paper and the best starting points for exploring how the results may vary across model specifications and distributional assumptions. Our replicated results are identical to the original results.

## Summary statistics table

\begin{table}[h]\centering \caption{Summary statistics\label{sumstat}}
\begin{tabular}{l c c c c c }\hline\hline 
\multicolumn{1}{c}{Variable} & Obs & Mean & Std. Dev.
 & Min & Max  \\ \hline 
 Panel A: Demographic Characteristics \\ \hline
Household Size & 1729 & 5.509 & 2.261 & 1 & 21  \\
Gender & 1728 & .409 & .492 & 0 & 1  \\
Age & 1729 & 53.66 & 14.339 & 30 & 105  \\
HH HH Literate & 933 & .446 & .497 & 0 & 1  \\
Household Income (1000 Rs) & 1539 & 101.855 & 161.489 & 0 & 2000  \\
HH Head Self-Employed Agriculture & 908 & .372 & .484 & 0 & 1  \\ \hline

 Panel B: Baseline Health Indicators \\ \hline


Blood pressure (systolic) & 1718 & 142.438 & 24.451 & 87 & 264  \\
Blood pressure (diastolic) & 1718 & 85.844 & 14.05 & 37 & 182  \\
Weight & 1677 & 64.925 & 14.697 & 28.3 & 164.9  \\
Pre-hypertension  & 1718 & .864 & .342 & 0 & 1  \\
Hypertension & 1718 & .577 & .494 & 0 & 1  \\
Overweight & 1677 & .499 & .5 & 0 & 1  \\
Obesity & 1677 & .175 & .38 & 0 & 1  \\

\hline

 Panel C: Take-up and Service Utilization \\ \hline

Take-up contract & 1725 & .154 & .361 & 0 & 1  \\
Proportion of doctor visits & 1725 & .069 & .269 & 0 & 4  \\
Any doctor visit & 1725 & .115 & .319 & 0 & 1  \\

\hline

 Panel D: Endline Health Indicators \\ \hline


Blood pressure (systolic) & 1512 & 138.78 & 24.187 & 64 & 245  \\
Blood pressure (diastolic) & 1512 & 85.023 & 15.417 & 12 & 161  \\
Weight & 1512 & 65.877 & 14.612 & 29.7 & 119.4  \\
Pre-hypertension & 1512 & .821 & .383 & 0 & 1  \\
Hypertension & 1512 & .496 & .5 & 0 & 1  \\
Overweight & 1470 & .537 & .499 & 0 & 1  \\
Obesity & 1470 & .203 & .402 & 0 & 1  \\

\hline
\end{tabular}
\end{table}

\pagebreak

## Regression Results
```{r message = FALSE, warning = FALSE, echo = FALSE}
library(readstata13)
df <- read.dta13('../input/BHMR_RESTAT_reduced_form.dta')
treatment_var <- c('discnt, fixcc_std, flexcc_std, fixcc_discnt, flexcc_discnt')
characteristics <- c('hh_size, gender, age, literate, hh_income, seagri, bp_sys, bp_dia, weight, hbp_freq_check, tehp')

# Data clean
df$trust_ehp[df$tehp == 1] <- 1 
df$trust_ehp[df$tehp == 2] <- 1 
df$trust_ehp[df$tehp == 3] <- 1 
df$trust_ehp[df$tehp == 4] <- 0 
df$trust_ehp[df$tehp == 5] <- 0

df$hyper_bl[df$bp_sys < 140 & df$bp_dia < 90] <- 0
df$hyper_bl[df$bp_sys >= 140 | df$bp_dia >= 90] <- 1

df$prehyper_bl[df$bp_sys < 120 & df$bp_dia < 80] <- 0
df$prehyper_bl[(df$bp_sys >= 120 & df$bp_sys < 140) | (df$bp_dia >= 80 & df$bp_dia < 90)] <- 1
df$prehyper_bl[df$hyper_bl == 1] <- 1

df$obesity_bl[df$bmi < 30] <- 0
df$obesity_bl[df$bmi >= 30] <- 1

df$overweight_bl[df$bmi < 25] <- 0
df$overweight_bl[df$bmi >= 25] <- 1

# Listing/Generating Outcome Variables and Endline Health Measures
df$visitfrac <- df$visittotal/3

df$hyper_el[df$bp_sys_el < 140 & df$bp_dia_el < 90] <- 0
df$hyper_el[df$bp_sys_el >= 140 | df$bp_dia_el >= 90] <- 1

df$prehyper_el[df$bp_sys_el < 120 & df$bp_dia_el < 80] <- 0
df$prehyper_el[(df$bp_sys_el >= 120 & df$bp_sys_el < 140) | (df$bp_dia_el >= 80 & df$bp_dia_el < 90)] <- 1
df$prehyper_el[df$hyper_el == 1] <- 1

df$height <- df$height/100
df$bmi_el <- df$weight_el/(df$height)^2

df$bmi_el[df$bmi_el > 100] <- NA

df$obesity_el[df$bmi_el < 30] <- 0
df$obesity_el[df$bmi_el >= 30] <- 1

df$overweight_el[df$bmi_el < 25] <- 0
df$overweight_el[df$bmi_el >= 25] <- 1

outcomes <- c('takeup, visitfrac, anyvisit, bp_sys_el, bp_dia_el, weight_el, hyper_bl, hyper_el, obesity_bl, obesity_el, overweight_bl, overweight_el')

# Generating Sample Indicators
df$PanelA_Sample <- 0 
df$PanelA_Sample[is.na(df$takeup) == 0 & is.na(df$anyvisit) ==0 & is.na(df$visittotal) == 0]<- 1

df$PanelB_Sample <- 0 
df$PanelB_Sample[is.na(df$weight_el) == 0 & is.na(df$bp_sys_el) == 0 & is.na(df$bp_dia_el) == 0] <- 1

df$target_sample <- 0 
df$target_sample[df$hbp_freq_check == 1 & df$trust_ehp ==1]<- 1

df$Target_PanelA_Sample <- 0 
df$Target_PanelA_Sample[df$hbp_freq_check == 1 & df$trust_ehp ==1 & is.na(df$takeup) == 0 & is.na(df$anyvisit) ==0 & is.na(df$visittotal) == 0]<- 1

df$Target_PanelB_Sample <- 0 
df$Target_PanelB_Sample[df$hbp_freq_check == 1 & df$trust_ehp ==1 & is.na(df$weight_el) == 0 & is.na(df$bp_sys_el) ==0 & is.na(df$bp_dia_el) == 0]<- 1

# Readjust some variables
df$hh_income <- df$hh_income/1000
```


```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
# Regression for Table 2, Panel A
library(lmtest)
library(sandwich)
library(stargazer)

takeup_reg <- lm(takeup ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, PanelA_Sample == 1))

anyvisit_reg <- lm(anyvisit ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, PanelA_Sample == 1))

visitfrac_reg <- lm(visitfrac ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, PanelA_Sample == 1))

covtakeup <- vcovHC(takeup_reg, type="HC1")
robust_se_takeup <- sqrt(diag(covtakeup))

covanyvisit <- vcovHC(anyvisit_reg, type="HC1")
robust_se_anyvisit <- sqrt(diag(covanyvisit))

covvisitfrac <- vcovHC(anyvisit_reg, type="HC1")
robust_se_visitfrac <- sqrt(diag(covvisitfrac))

stargazer(takeup_reg, anyvisit_reg, visitfrac_reg, type="latex", se = list(robust_se_takeup, robust_se_anyvisit, robust_se_visitfrac),  title = "Take-up and Service Utilization, Full Sample", dep.var.labels = c("Take-up", "Any visit", "Proportion of visits"), covariate.labels = c("Discount", "Fixed CC", "Personalized CC", "Fixed CC + Discount", "Personalized CC + Discount"), align=TRUE, no.space=TRUE, header=FALSE)
```

```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
# Regression for Table 2, Panel B
library(lmtest)
library(sandwich)
library(stargazer)

bp_sys_el_reg <- lm(bp_sys_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, PanelB_Sample == 1))

bp_dia_el_reg <- lm(bp_dia_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, PanelB_Sample == 1))

weight_el_reg <- lm(weight_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, PanelB_Sample == 1))

covbp_sys_el <- vcovHC(bp_sys_el_reg, type="HC1")
robust_se_bp_sys_el <- sqrt(diag(covbp_sys_el))

covbp_dia_el <- vcovHC(bp_dia_el_reg, type="HC1")
robust_se_bp_dia_el <- sqrt(diag(covbp_dia_el))

covweight_el <- vcovHC(weight_el_reg, type="HC1")
robust_se_weight_el <- sqrt(diag(covweight_el))

stargazer(bp_sys_el_reg, bp_dia_el_reg, weight_el_reg, type="latex", se = list(robust_se_bp_sys_el, robust_se_bp_dia_el, robust_se_weight_el),  title = "Endline Health Outcomes, Full Sample", dep.var.labels = c("BP (systolic)", "BP (diastolic)", "Weight (kg)"), covariate.labels = c("Discount", "Fixed CC", "Personalized CC", "Fixed CC + Discount", "Personalized CC + Discount"), align=TRUE, no.space=TRUE, header=FALSE, omit = c("bp_sys", "bp_dia", "weight"))
```

```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
# Regression for Table 3, Panel A
library(lmtest)
library(sandwich)
library(stargazer)

takeup_reg2 <- lm(takeup ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, Target_PanelA_Sample == 1))

anyvisit_reg2 <- lm(anyvisit ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, Target_PanelA_Sample == 1))

visitfrac_reg2 <- lm(visitfrac ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, Target_PanelA_Sample == 1))

covtakeup2 <- vcovHC(takeup_reg2, type="HC1")
robust_se_takeup2 <- sqrt(diag(covtakeup2))

covanyvisit2 <- vcovHC(anyvisit_reg2, type="HC1")
robust_se_anyvisit2 <- sqrt(diag(covanyvisit2))

covvisitfrac2 <- vcovHC(anyvisit_reg2, type="HC1")
robust_se_visitfrac2 <- sqrt(diag(covvisitfrac2))

stargazer(takeup_reg2, anyvisit_reg2, visitfrac_reg2, type="latex", se = list(robust_se_takeup2, robust_se_anyvisit2, robust_se_visitfrac2),  title = "Take-up and Service Utilization, Ideal Sample", dep.var.labels = c("Take-up", "Any visit", "Proportion of visits"), covariate.labels = c("Discount", "Fixed CC", "Personalized CC", "Fixed CC + Discount", "Personalized CC + Discount"), align=TRUE, no.space=TRUE, header=FALSE)
```


```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
# Regression for Table 3, Panel B
library(lmtest)
library(sandwich)
library(stargazer)

bp_sys_el_reg2 <- lm(bp_sys_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, Target_PanelB_Sample == 1))

bp_dia_el_reg2 <- lm(bp_dia_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, Target_PanelB_Sample == 1))

weight_el_reg2 <- lm(weight_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, Target_PanelB_Sample == 1))

covbp_sys_el2 <- vcovHC(bp_sys_el_reg2, type="HC1")
robust_se_bp_sys_el2 <- sqrt(diag(covbp_sys_el2))

covbp_dia_el2 <- vcovHC(bp_dia_el_reg2, type="HC1")
robust_se_bp_dia_el2 <- sqrt(diag(covbp_dia_el2))

covweight_el2 <- vcovHC(weight_el_reg2, type="HC1")
robust_se_weight_el2 <- sqrt(diag(covweight_el2))

stargazer(bp_sys_el_reg2, bp_dia_el_reg2, weight_el_reg2, type="latex", se = list(robust_se_bp_sys_el2, robust_se_bp_dia_el2, robust_se_weight_el2), title = "Endline Health Outcomes, Ideal Sample",  dep.var.labels = c("BP (systolic)", "BP (diastolic)", "Weight (kg)"), covariate.labels = c("Discount", "Fixed CC", "Personalized CC", "Fixed CC + Discount", "Personalized CC + Discount"), align=TRUE, no.space=TRUE, header=FALSE, omit = c("bp_sys", "bp_dia", "weight"))
```

\pagebreak



# Extensions



## Balance Checks

In the @bai_2020 paper, there is little discussion of potential failures of randomization. It is not clear whether some participants were lost to follow-up, for example. While the authors conduct a balance check in their table A1, a limited set of variables were included. Furthermore, the authors did not conduct a joint test of whether the differences in covariates significantly varied across treatment arms. Instead, the authors tested each covariate individually across treatment arms.

The @bai_2020 paper included six treatment arms. There was a full control group; a discount arm that provided subjects discounts on preventive health services; a standard commitment contract (CC) arm that offered a CC where subjects paid a fixed amount upfront that would be repaid if they went to a preventive health visit; a standard CC plus discount arm, which combined the standard CC with a discount; a personalized CC arm that offered a CC where subjects could customize the amount that they paid upfront as a commitment device; and a personalized CC plus discount arm, which combined the personalized CC with a discount.

To facilitate testing whether there was covariate balance across the treatment arms, we defined new binary treatment variables. One variable is 1 if the subject was assigned to a treatment arm that involved provision of a commitment contract and 0 otherwise. A second binary treatment variable is 1 if the subject was given a discount and 0 otherwise. Using a F-test, we tested the null hypothesis that all covariate regression coefficients were 0 in regressions with the binary treatment variables as dependent variables. We first examine the set of covariates the authors use for their balance checks and then consider an alternate set of covariates that removes variables with high frequency of missing values and includes additional covariates that measure behavioral tendencies of respondents, including whether they are taking medications for hypertension, whether they drink alcohol, whether they smoke, whether they report that they procrastinate, and whether they report that they are impatient. 

We find that a number of behavioral variables are significantly correlated with treatment, and further, an F-test of joint significance suggests there may be an imbalance in observable characteristics for some covariate sets for both the commitment contract treatments and the discount treatments.

These results suggest that even though treatment was randomized, we may see significant differences in important characteristics across the treatment groups. In particular, the results suggest that fewer people in the groups that offered commitment contracts were taking medicine for their hypertension at baseline. In the data we see a positive correlation between individuals taking medication at baseline and whether they make a preventive care visit. A potential explanation is that individuals who take medicine for hypertension at baseline are more likely to engage in other preventive health behaviors. 

These results suggest that we should be cautious in trusting the results of simple comparisons in outcomes across treatment groups, since there are differences in observable characteristics and likely also unobservable characteristics.

```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
# Balance checks
library(car)
library(lmtest)
library(sandwich)
library(stargazer)

df$cc_all<- 0
df$cc_all[df$fixcc_std==1 | df$fixcc_discnt==1 | df$flexcc_std==1 | df$flexcc_discnt==1] <- 1
cc_balance_lm <- lm(cc_all ~ hh_size + gender + age + literate + hh_income + seagri + bp_sys + bp_dia + weight + hbp_freq_check + trust_ehp, data=subset(df, PanelB_Sample == 1))
#linearHypothesis(cc_balance_lm, c("hh_size=0", "gender=0", "age=0", "literate=0", "seagri=0", "bp_sys=0", "bp_dia=0", "weight=0", "hbp_freq_check=0", "trust_ehp=0"))
cc_balance_lm_2 <- lm(cc_all ~ hh_size + gender + age + literate + hh_income + taking_meds+ alchl + smoke+ procrastinate + impatient  + hbp_freq_check, data=subset(df, PanelA_Sample == 1))
#linearHypothesis(cc_balance_lm_2, c("hh_size=0", "gender=0", "age=0", "literate=0", "taking_meds=0", "alchl=0", "smoke=0", "procrastinate=0",  "impatient=0", "hbp_freq_check=0"))


df$disc_all<-0
df$disc_all[df$discnt==1 | df$fixcc_discnt==1 |  df$flexcc_discnt==1]  <- 1
disc_balance_lm <- lm(disc_all ~ hh_size + gender + age + literate + hh_income + seagri + bp_sys + bp_dia + weight + hbp_freq_check + trust_ehp, data=subset(df, PanelB_Sample == 1))
#linearHypothesis(disc_balance_lm, c("hh_size=0", "gender=0", "age=0", "literate=0", "seagri=0", "bp_sys=0", "bp_dia=0", "weight=0", "hbp_freq_check=0", "trust_ehp=0"))
disc_balance_lm_2 <- lm(disc_all ~ hh_size + gender + age + literate + hh_income + taking_meds+ alchl + smoke+ procrastinate + impatient  + hbp_freq_check, data=subset(df, PanelA_Sample == 1))
#linearHypothesis(disc_balance_lm_2, c("hh_size=0", "gender=0", "age=0", "literate=0", "taking_meds=0", "alchl=0", "smoke=0", "procrastinate=0",  "impatient=0", "hbp_freq_check=0"))
stargazer(cc_balance_lm, cc_balance_lm_2, disc_balance_lm, disc_balance_lm_2, type="latex",  title = "Balance Checks, Alternative Covariate Sets",  dep.var.labels = c("Any CC","Any Discount"), covariate.labels = c("HH Size", "Gender", "Age", "Literate", "HH Income","Agricultural Job","BP (systolic)","BP (diastolic)","Weight (kg)", "Prior Medication", "Prior Alcohol", "Prior Smoking", "Procrastinates", "Is Impatient", "Possible to Control BP","Trusts Provider"), align=TRUE, no.space=TRUE, header=FALSE, font.size = "small")

library(tidyverse)
library(foreign)
library(StatMatch)
library(broom)


cov_means <- df %>% 
  group_by(cc_all) %>% 
  summarise(across(c(hh_size, gender, age,literate,hh_income,taking_meds,alchl,smoke,procrastinate,impatient,hbp_freq_check), mean, na.rm=TRUE)) %>%
  t() %>%
  as_tibble(rownames = 'var') %>%
  slice(-1) %>%
  rename(mean_control = 2,
         mean_treated = 3)

## Variance in treated / control group 

cov_var <- df %>% 
  group_by(cc_all) %>% 
  summarise(across(c(hh_size, gender, age,literate,hh_income,taking_meds,alchl,smoke,procrastinate,impatient,hbp_freq_check), var, na.rm=TRUE)) %>%
  t() %>%
  as_tibble(rownames = 'var') %>%
  slice(-1) %>%
  rename(var_control = 2,
         var_treated = 3)

## Combine 
## Calculate standardized mean difference (SMD)

bal_pre <- cov_means %>% 
  left_join(cov_var) %>%
  mutate(mean_diff = mean_treated - mean_control,
         SMD = mean_diff / sqrt((var_control + var_treated)/2)) %>%
  mutate(type = 'Before matching')


ggplot(bal_pre, aes(x = var, y = SMD)) + 
  geom_bar(stat = 'identity',
           col = 'black') +
  coord_flip() + 
  labs(y = 'Normalized difference',
       x = 'Variable') + ggtitle("Figure 1: Balance plot for selected variables, CC treatment groups compared with control groups")+
  theme_bw()



```
\pagebreak

## Matching Analysis
Given the modest imbalance in variable values, we explored whether matching would result in different treatment effects than those that the authors of the original manuscript reported. We match observations that were randomized to receive any commitment contract to those that did not receive a commitment contract. We use the variables in the balance plot shown in Figure 1 to match treatment observations with similar control observations, and we use the Mahalanobis distance, matching with replacement to identify the most suitable control observations.

We find that the results in Table 7, which reproduce the results in Table 2 for the matched sample, are generally similar to the original results. This finding suggests that, while there is some modest but significant imbalance for certain covariates with respect to receipt of the commitment contracts, reducing this imbalance does not appear to lead to substantial differences in the primary outcomes of interest.

```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}

# Matching
library(MatchIt)
library(optmatch)
library(StatMatch)
library(tidyverse)
library(broom)
vars_keep <- c('age', 'hh_size', 'gender', 'literate', 'hh_income', 'taking_meds', 'alchl', 'smoke', 'procrastinate', 'impatient', 'hbp_freq_check', 'cc_all', 'PanelA_Sample', 'takeup',  'discnt', 'fixcc_std' , 'flexcc_std' , 'fixcc_discnt' , 'flexcc_discnt', 'visitfrac', 'anyvisit')
df2<-subset(df,!is.na(df$age) & !is.na(df$hh_size) & !is.na(df$gender) & !is.na(df$literate) & !is.na(df$hh_income) & !is.na(df$taking_meds) & !is.na(df$alchl) & !is.na(df$smoke) & !is.na(df$procrastinate) & !is.na(df$impatient) & !is.na(df$hbp_freq_check) & PanelA_Sample == 1, select = vars_keep)
big_X <- df2 %>% 
  dplyr::select(c(hh_size, gender, age,literate,hh_income,taking_meds,alchl,smoke,procrastinate,impatient,hbp_freq_check)) %>% 
  as.matrix()

## Only treated units 

X_treated <- df2 %>%
  filter(cc_all == 1 ) %>%
  dplyr::select(c(hh_size, gender, age,literate,hh_income,taking_meds,alchl,smoke,procrastinate,impatient,hbp_freq_check)) %>% 
  as.matrix()

## Only control units 

X_control <- df2 %>%
  filter(cc_all == 0 ) %>%
  dplyr::select(c(hh_size, gender, age,literate,hh_income,taking_meds,alchl,smoke,procrastinate,impatient,hbp_freq_check)) %>% 
  as.matrix()

## Calculate Mahalanobis Distance between all observation-pairs 
## using mahalanobis.dist function from the StatMatch package 

M <- mahalanobis.dist(data.x = X_treated, 
                      data.y = X_control,
                      vc = var(big_X))

## For each treated unit, find closest control unit 
## in terms of Mahalanobis Distance 

match_ids <- unlist(apply(M, 1, function(i){which(i == min(i))[1]}))

## Create matched dataset 

matched_data <- rbind(subset(df2, cc_all == 1),
                      subset(df2, cc_all == 0)[match_ids,])


## Check balance after matching ! 
## Exactly the same procedure as before (lines 10 - 45)

cov_means_matched <- matched_data %>% 
  group_by(cc_all) %>% 
  summarise(across(c(hh_size, gender, age,literate,hh_income,taking_meds,alchl,smoke,procrastinate,impatient,hbp_freq_check), mean,na.rm=TRUE)) %>%
  t() %>%
  as_tibble(rownames = 'var') %>%
  slice(-1) %>%
  rename(mean_control = 2,
         mean_treated = 3)


cov_var_matched <- matched_data %>% 
  group_by(cc_all) %>% 
  summarise(across(c(hh_size, gender, age,literate,hh_income,taking_meds,alchl,smoke,procrastinate,impatient,hbp_freq_check), var, na.rm=TRUE)) %>%
  t() %>%
  as_tibble(rownames = 'var') %>%
  slice(-1) %>%
  rename(var_control = 2,
         var_treated = 3)

bal_post <- cov_means_matched %>% 
  left_join(cov_var_matched) %>%
  mutate(mean_diff = mean_treated - mean_control,
         SMD = mean_diff / sqrt((var_control + var_treated)/2)) %>%
  mutate(type = 'After matching')

## Compare balance before and after matching visually 

#ggplot(rbind(bal_pre, bal_post), aes(x = var, y = SMD, 
 #                                    col = type, group = type, shape = type)) +
#  geom_point() + 
#  coord_flip() + 
#  labs(y = 'Normalized difference',
 #      x = 'Variable') + 
 # theme_bw() + 
#  geom_hline(yintercept = 0,
#             linetype = 'dotted') + 
#  theme(legend.position = 'bottom',
#        legend.title = element_blank())


## Post-matching estimation 

takeup_reg <- lm(takeup ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(matched_data, PanelA_Sample == 1))

anyvisit_reg <- lm(anyvisit ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(matched_data, PanelA_Sample == 1))

visitfrac_reg <- lm(visitfrac ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(matched_data, PanelA_Sample == 1))

covtakeup <- vcovHC(takeup_reg, type="HC1")
robust_se_takeup <- sqrt(diag(covtakeup))

covanyvisit <- vcovHC(anyvisit_reg, type="HC1")
robust_se_anyvisit <- sqrt(diag(covanyvisit))

covvisitfrac <- vcovHC(anyvisit_reg, type="HC1")
robust_se_visitfrac <- sqrt(diag(covvisitfrac))

#stargazer(takeup_reg, anyvisit_reg, visitfrac_reg, type="latex", se = list(robust_se_takeup, robust_se_anyvisit, robust_se_visitfrac),  title = "Take-up and Service Utilization, Matched Section Code Sample", dep.var.labels = c("Take-up", "Any visit", "Proportion of visits"), covariate.labels = c("Discount", "Fixed CC", "Personalized CC", "Fixed CC + Discount", "Personalized CC + Discount"), align=TRUE, no.space=TRUE, header=FALSE)

#lm(takeup ~ cc_all, data = matched_data) %>% tidy() %>% filter(term == 'cc_all')
#lm(anyvisit ~ cc_all, data = matched_data) %>% tidy() %>% filter(term == 'cc_all')
#lm(visitfrac ~ cc_all, data = matched_data) %>% tidy() %>% filter(term == 'cc_all')


match_dist <- matchit(cc_all ~ hh_size+ gender+ age+literate+hh_income+taking_meds+alchl+smoke+procrastinate+impatient+hbp_freq_check, data = df2,
method = "nearest",
distance = "mahalanobis", replace=TRUE,generate.factors=TRUE)

nearest_dat <- match.data(match_dist)

takeup_m2<-lm(takeup ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=nearest_dat, weights=weights)
anyvisit_m2<-lm(anyvisit ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=nearest_dat, weights=weights)
visitfrac_m2<-lm(visitfrac ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=nearest_dat, weights=weights)

stargazer(takeup_m2, anyvisit_m2, visitfrac_m2, type="latex", se = list(robust_se_takeup, robust_se_anyvisit, robust_se_visitfrac),  title = "Take-up and Service Utilization, Matched Sample", dep.var.labels = c("Take-up", "Any visit", "Proportion of visits"), covariate.labels = c("Discount", "Fixed CC", "Personalized CC", "Fixed CC + Discount", "Personalized CC + Discount"), align=TRUE, no.space=TRUE, header=FALSE)

```

\pagebreak

## Alternative models 

\par

In this section, we compare the main estimates from Table 2 in the Bai et al (2020) paper with. In Table 2 of the Bai et al (2020) paper, the first two columns are binary variables which report the commitment contracts take-up, while the third column reports the number of visits an individual had during the endline survey divided by the recommended number of visits at three visits. These three columns were estimated using linear regressions. 

### Logistic regression

Consider the first two columns where the dependent variable is a binary indicator. These estimates are estimated from a linear probability model, where the stochastic component is $Y_i \sim \text{Bernoulli}(y_i|pi_i)$. The systematic component is defined as $Pr(Y_i = 1 | \beta) \equiv E(Y_i) \equiv \pi_i = x_i\beta$. This assumes that $Y_i$ has a linear relationship with the covariates $x_i$. We were curious about whether the results would change if we assumed that the $Y_i$ has a non-linear relationship with the covariates $x_i$. Specifically, what if we assumed that the systematic component follows a logistic specification: $Pr(Y_i = 1 | \beta) = \frac{1}{e^{-x_i\beta}}$. The following table shows our results:


```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
library(MASS)
library(stargazer)
df1<-subset(df, PanelA_Sample == 1)


# Regression for Table 2, Panel A
library(lmtest)
library(sandwich)
library(stargazer)

takeup_logit <- glm(takeup ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, PanelA_Sample == 1), family = 'binomial')

anyvisit_logit <- glm(anyvisit ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, PanelA_Sample == 1), family='binomial')

stargazer(takeup_logit, anyvisit_logit, type="latex",title = "Take-up and Service Utilization, Full Sample (Logit)", dep.var.labels = c("Take-up", "Any visit"), covariate.labels = c("Discount", "Fixed CC", "Personalized CC", "Fixed CC + Discount", "Personalized CC + Discount"), align=TRUE, no.space=TRUE, header=FALSE , table.placement = "H")
```

Given the large coefficients and the statistical significance, these results provide us with confidence that the findings from Table 2 are not a result of model dependence. 

### Poisson regression

We then move on to examine the insignificance of the results presented in Table 2 of the Bai et al (2020) paper. Specifically, we were curious about the third column of Table 2 which examines the proportion of visits out of the recommended three visits. The authors decided to model the total number of visits as proportions instead of as count values, and we were curious if fitting a count model such as a Poisson regression would alter the results on the last column of Table 2. Specifically, instead of using the 'visitfrac' variable, we use the `visittotal' variables. We provide a summary statistics table of the two variables in the following table:

```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
stargazer(subset(df, select = c('visitfrac', 'visittotal')), header=FALSE , table.placement = "H", title = "Summary statistics for 'visittotal' and 'visitfrac'")
```

Recall that the stochastic component of the Poisson regression is that $Y_i \sim \text{Poisson}(y_i|\lambda_i)$ and the systematic component is $\lambda_i = \text{exp}(x_i \beta)$, our regression estimates are presented in the table below alongside the same outcome specification using a linear model:

```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
poisson <- glm(visittotal ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data = df1, family = "poisson")
m1 <- lm(visittotal ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data = df1)
stargazer(poisson, m1, dep.var.labels = c("Total number of visit"), title = "Poisson regression and OLS regression results", covariate.labels = c("Discount", "Fixed CC", "Personalized CC", "Fixed CC + Discount", "Personalized CC + Discount"), align=TRUE, no.space=TRUE, header=FALSE , table.placement = "H")

```

Our poisson results do seem to suggest that for certain treatment groups, the impact of the commitment contracts have a positive and statistically significant effect on the total number of visits to a clinic. This is in contrast with the conclusions established by the authors, which found that the commitment contracts had a small and statisticall insignificant effect on clinic visits. However, an underlying question remains if a poisson regression is suitable in this particular research design. One assumption that defines the poisson regression is that the mean and the variance of $Y_i \sim \text{Poisson}(y_i|\lambda_i)$ are equal. We test for this assumption by conducting an overdispersion/underdispersion test in the spirit of @cameron1990regression. We find that we reject the null hypothesis of no overdispersion and estimate that the dispersion statistic $\Phi$ defined as: $Var(Y|X )= E(Y|X) + \Phi[E(Y|X)]^2$ takes the value of 2.9. 


```{r results = FALSE, message = FALSE, warning = FALSE, echo = FALSE}
library(AER)
dispersiontest(poisson)
```

Given that there is evidence of overdispersion, we proceed by fitting the quasi-Poisson model to examine if our results on the Poisson regression holds. 

```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
overpoisson <- glm(visittotal ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data = df1, family = "quasipoisson")
stargazer(overpoisson, dep.var.labels = c("Total number of visit"), title = "Quasi-Poisson regression results", covariate.labels = c("Discount", "Fixed CC", "Personalized CC", "Fixed CC + Discount", "Personalized CC + Discount"), align=TRUE, no.space=TRUE, header=FALSE , table.placement = "H")
```

The results suggest that after correcting for over-dispersion, the statistical significance of our poisson model results attenuated. This suggest that the authors' approach of estimating the basic linear regression model is suitable for this specific research design. Overall, our findings strengthen the valdiity of the original results. 

### Multiple Equation Models

The authors exmaine an extensive set of outcome variables: takeup of commitment contracts, preventive health care visits, blood pressure, weight, alcohol use, smoking, weight, exercise. These outcomes are measure both at baseline and at the end of the study. The authors use a framework for analysis that considers each outcome variable in isolation and tests whether there is a significant effect of being in a particular treatment group on the end-line outcomes. When we are estimating the effects of treatment on multiple outcomes, we may be able to increase efficiency by considering a multivariate framework that accounts for the covariances between equations. One option is the seemingly unrelated regression (SUR) framework proposed by @zellner1962efficient.

We consider two SUR models. One includes all outcome measures, including endline health measurements. Specifically, the full set of outcome measures includes: take-up of a commitment contract, any preventive care visit, preventive care visit frequency, prehypertension at endline, blood pressure at endline, weight at endline, BMI at endline, smoking at endline, alcohol use at endline, and exercise at endline. Because a sizable proportion of the population did not receive endline health measurements, we also consider a model that excludes endline health outcomes. 

Overall, the results suggest that the original regression results are fairly precisely estimated, but the SUR framework also allows us to test additional hypothesis tests. In particular, we can test the hypothesis that the effects of the treatment are null across multiple outcome measures (Table 12). From these results we can reject the hypothesis that being in the treatment arms has no effect on any outcomes, except for the pure discount treatment arm which shows only a marginally significant effect across the set of outcomes.

We also present a limited subset of the SUR regression results, showing the coefficients for the outcomes of take-up, any preventive care visit, and preventive care visit frequency in Table 13. The two columns of Table 13 show the results corresponding to the two SUR models. The first column provides results for the model that only included these three outcome variables. The second column provides results for the model that included additional outcome variables, omitting the coefficients for these additional endline outcomes from the table. We did not find significant effects of the treatments on these additional outcomes.


```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
# SUREG
library(texreg)
library(systemfit)
library(car)
library(lmtest)
library(sandwich)
library(stargazer)

df$village2 <- df$vill_id==2
df$village3 <- df$vill_id==3
df$village4 <- df$vill_id==4

takeup_eq <- takeup ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + procrastinate +  village2 + village3 + village4
anyvisit_eq <- anyvisit ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + impatient +  distance_ehp + hh_size + taking_meds + village2 + village3 + village4
visitfrac_eq <- visitfrac ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + impatient +  distance_ehp + hh_size + taking_meds + hh_income + village2 + village3 + village4
prehyper_eq <- prehyper_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + age + gender + prehyper_bl + bp_sys + bp_dia + weight + taking_meds
bp_sys_eq <- bp_sys_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + age + gender + bp_sys + bp_dia + weight + taking_meds
bp_dia_eq <- bp_sys_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + age + gender + bp_sys + bp_dia + weight + taking_meds
weight_eq <- weight_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + age + gender + weight + waist + exrc + smoke + taking_meds
bmi_eq <- bmi_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + age + gender + weight + height+  waist + exrc +smoke + taking_meds
smoke_eq <- smoke_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + age + gender  + smoke + alchl
alchl_eq <- alchl_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + age + gender + smoke + alchl
exrc_eq <- exrc_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + age + gender + smoke + alchl +exrc


restrict_disc<- c("takeup_discnt = 0","anyvisit_discnt=0","visitfrac_discnt=0")
restrict_fixcc_std<- c("takeup_fixcc_std = 0","anyvisit_fixcc_std=0","visitfrac_fixcc_std=0")
restrict_fixcc_discnt<- c("takeup_fixcc_discnt = 0","anyvisit_fixcc_discnt=0","visitfrac_fixcc_discnt=0")
restrict_flexcc_std<- c("takeup_flexcc_std = 0","anyvisit_flexcc_std=0","visitfrac_flexcc_std=0")
restrict_flexcc_discnt<- c("takeup_flexcc_discnt = 0","anyvisit_flexcc_discnt=0","visitfrac_flexcc_discnt=0")

system_basic <- list(takeup=takeup_eq,anyvisit=anyvisit_eq, visitfrac=visitfrac_eq)
#system_full <- list(takeup_eq,anyvisit_eq, visitfrac_eq,prehyper_eq,bp_sys_eq,bp_dia_eq, weight_eq,bmi_eq,smoke_eq,alchl_eq,exrc_eq)
system_full <- list(takeup=takeup_eq,anyvisit=anyvisit_eq, visitfrac=visitfrac_eq,prehyper=prehyper_eq,bpsys=bp_sys_eq,weight=weight_eq,bmi=bmi_eq, smoke=smoke_eq,alchl=alchl_eq,exrc=exrc_eq)

fit_sur_basic<- systemfit(system_basic,"SUR", data=subset(df, PanelA_Sample == 1))
fit_sur_full<- systemfit(system_full,"SUR", data=subset(df, PanelB_Sample == 1))

t1 <- linearHypothesis(fit_sur_basic, restrict_disc, test = "Chisq")
t2 <- linearHypothesis(fit_sur_basic, restrict_fixcc_std, test = "Chisq")
t3 <- linearHypothesis(fit_sur_basic, restrict_fixcc_discnt, test = "Chisq")
t4 <- linearHypothesis(fit_sur_basic, restrict_flexcc_std, test = "Chisq")
t5 <- linearHypothesis(fit_sur_basic, restrict_flexcc_discnt, test = "Chisq")

names<- c("Discount","Fixed CC","Fixed CC + Discount","Personalized CC","Personalized CC + Discount")
pvalues<- c(t1$`Pr(>Chisq)`,t2$`Pr(>Chisq)`,t3$`Pr(>Chisq)`,t4$`Pr(>Chisq)`,t5$`Pr(>Chisq)`)
suregpvaluedf<-data.frame(names,pvalues[!is.na(pvalues)])
names(suregpvaluedf)[1]<- "Treatment arm"
names(suregpvaluedf)[2]<- "P value"

knitr::kable(suregpvaluedf, format="latex",caption="P Values from SUR Joint Hypothesis Tests")
```

```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}

texreg(list(fit_sur_basic,fit_sur_full), caption.above=TRUE,caption = "Seemingly Unrelated Regression Results", fontsize = "small",custom.model.names=c("Limited Outcome Set","Full Outcome Set"),omit.coef="village|distance|prehyper|bpsys|weight|exrc|bmi|smoke|alchl|Intercept|obs.|meds|procrastinate|impatient|hh")



```


\pagebreak



## Permutation test

The original linear model in @bai_2020 paper implicitly assumes that the outcome of interest is normally distributed given a set of covariates. We relax this assumption by implementing a permutation test which requires no normality assumption. The advantage of permutation test is that it doesn't construct p-values by assuming a theoretical distribution of the chosen statistics as in parametric settings. Instead, it constrcut p-values for the test by randomly shuffling the observed data, fitting the model repeately for every shuffle, and obtaining an empirical distribution of the statistics. 

One key factor that validates our use of the permutaion test as a extension strategy is the random assignment of treatment groups in the original experimental set-up. This is because the permutation method relies on an essential assumption that samples are exchangeable under a true null hypothesis. Using this test in conditions where the assumption is not met may introduce other statistical biases. For instance, @hayes_permutation_1996 shows that type I error rates may become inflated if permutation tests are used with non-randomized data. In the experimental data we are working with, this assumption of exchangeability is automatically met with a randomized set-up.

We conduct the permutation test using the R package lmPerm [@wheeler_lmperm_2016] which produces tests for linear models. The results for the permutation analysis for full sample and ideal sample are in table 13 and 14 respectively. They differ only in the p-value from the origin regression results In Bai's paper. Overall, there is good agreement between the p-values between the two sets of results. However, one interesting (and consistent) result we find is that the health outcome `weight` at endline seems to be statistically significant different across treatment groups in both full sample and ideal sample under permutation tests, as opposed to no difference in the original paper.


```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE, results='hide'}
# Regression for Table 2
library(lmPerm)
set.seed(02138)
summary(takeup_reg2 <- lmp(takeup ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, PanelA_Sample == 1), perm="Prob", seqs = T))

summary(anyvisit_reg2 <- lmp(anyvisit ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, PanelA_Sample == 1), perm="Prob", seqs = T))

summary(visitfrac_reg2 <- lmp(visitfrac ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, PanelA_Sample == 1), perm="Prob", seqs = T))

summary(lmp(bp_sys_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, PanelB_Sample == 1), perm="Prob", seqs = T))

summary(lmp(bp_dia_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, PanelB_Sample == 1), perm="Prob", seqs = T))

summary(lmp(weight_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, PanelB_Sample == 1), perm="Prob", seqs = T))
```


\begin{sidewaystable}[!htbp] \centering 
  \caption{Permutation test, Full Sample} 
  \label{} 
\begin{tabular}{@{\extracolsep{5pt}}lcccccc} 
\\[-1.8ex]\hline 
\hline \\[-1.8ex] 
 & \multicolumn{6}{c}{\textit{Dependent variable:}} \\ 
\cline{2-7} 
\\[-1.8ex] & Take-up & Any visit & Proportion of visits & BP (systolic) & BP (diastolic) & Weight (kg) \\ 
\\[-1.8ex] & (1) & (2) & (3) & (4) & (5) & (6)\\ 
\hline \\[-1.8ex] 
 Discount & 0.00002$^{***}$ & 0.056 & 0.050$^{***}$ & 0.614 & 0.081 & $-$0.162 \\ 
  & (<0.001) & (0.592) & (<0.001) & (1.000) & (0.099) & (1.000) \\ 
  & & & & & & \\ 
 Fixed CC & 0.133$^{***}$ & 0.006 & 0.008 & $-$0.286 & 0.031 & $-$0.442$^{***}$ \\ 
  & (<0.001) & (0.824) & (0.096) & (0.151) & (1.000) & (<0.001) \\ 
  & & & & & & \\ 
 Personalized CC & 0.138$^{***}$ & 0.010 & 0.020 & 0.600 & 1.933 & $-$0.128$^{***}$ \\ 
  & (<0.001) & (1.000) & (1.000) & (0.619) & (0.060) & (<0.001) \\ 
  & & & & & & \\ 
 Fixed CC + Discount & 0.254$^{***}$ & 0.036 & 0.015 & $-$0.461$^{*}$ & $-$0.082 & 0.173 \\ 
  & (<0.001) & (1.000) & (1.000) & (0.044) & (0.074) & (0.066) \\ 
  & & & & & & \\ 
 Personalized C + Discount & 0.384$^{***}$ & 0.048 & 0.042$^{***}$ & 1.105 & 1.747 & $-$0.182$^{***}$ \\ 
  & (<0.001) & (0.132) & (<0.001) & (0.606) & (1.000) & (<0.001) \\ 
  & & & & & & \\ 
\hline \\[-1.8ex] 
\textit{Note:}  & \multicolumn{6}{r}{p-value of the permutation test in parentheses} \\ 
\end{tabular} 
\end{sidewaystable} 



```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE, results='hide'}
# Regression for Table 3
summary(lmp(takeup ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, Target_PanelA_Sample == 1), perm="Prob", seqs = T))

summary(lmp(anyvisit ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, Target_PanelA_Sample == 1), perm="Prob", seqs = T))

summary(lmp(visitfrac ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt, data=subset(df, Target_PanelA_Sample == 1), perm="Prob", seqs = T))

summary(lmp(bp_sys_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, Target_PanelB_Sample == 1), perm="Prob", seqs = T))

summary(lmp(bp_dia_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, Target_PanelB_Sample == 1), perm="Prob", seqs = T))

summary(lmp(weight_el ~ discnt + fixcc_std + flexcc_std + fixcc_discnt + flexcc_discnt + bp_sys + bp_dia + weight, data=subset(df, Target_PanelB_Sample == 1), perm="Prob", seqs = T))
```

\begin{sidewaystable}[!htbp] \centering 
  \caption{Permutation test, Ideal Sample} 
  \label{} 
\begin{tabular}{@{\extracolsep{5pt}}lcccccc} 
\\[-1.8ex]\hline 
\hline \\[-1.8ex] 
 & \multicolumn{6}{c}{\textit{Dependent variable:}} \\ 
\cline{2-7} 
\\[-1.8ex] & Take-up & Any visit & Proportion of visits & BP (systolic) & BP (diastolic) & Weight (kg) \\ 
\\[-1.8ex] & (1) & (2) & (3) & (4) & (5) & (6)\\ 
\hline \\[-1.8ex] 
 Discount & $-$0.000$^{***}$ & 0.106 & 0.119 & 0.276 & $-$0.988 & $-$0.318$^{***}$ \\ 
  & (<0.001) & (0.633) & (0.444) & (1.000) & (1.000) & (<0.001)\\ 
  & & & & & & \\ 
 Fixed CC & 0.172 & 0.107 & 0.077 & $-$0.802 & $-$0.638 & $-$0.910$^{***}$ \\ 
  & (0.056) & (0.941) & (0.218) & (1.000) & (0.167) & (<0.001)\\ 
  & & & & & & \\ 
 Personalized CC & 0.192 & 0.031 & 0.025 & 3.557 & 3.526 & $-$0.228 \\ 
  & (0.477) & (0.189) & (0.642) & (0.863) & (0.219) & (0.594) \\ 
  & & & & & & \\ 
 Fixed CC + Discount & 0.319 & 0.074 & 0.040 & 0.978 & 1.380$^{*}$ & $-$0.972$^{***}$ \\ 
  & (0.346) & (0.540) & (1.000) & (0.080) & (0.045) & (<0.001) \\ 
  & & & & & & \\ 
 Personalized CC + Discount & 0.494$^{***}$ & 0.130$^{**}$ & 0.121 & 3.967$^{**}$ & 4.724 & $-$0.553$^{***}$ \\ 
  & (<0.001) & (0.004) & (0.070) & (0.003) & (1.000) & (<0.001) \\ 
  & & & & & & \\ 
\hline \\[-1.8ex] 
\textit{Note:}  & \multicolumn{6}{r}{p-value of the permutation test in parentheses} \\ 
\end{tabular} 
\end{sidewaystable} 


\pagebreak


## Targeting of commitment contracts

Even if commitment contracts have little effect on the average person in the sample of @bai_2020, we might find that offering commitment contracts is more or less effective for some people than others. We conduct additional regression analysis to test for interaction effects with the commitment contract treatments using the SUR framework to increase efficiency. To simplify our regression specifications, we use a single binary variable to represent whether a subject was randomized to receive a commitment contract, as described previously. We then interact the treatment with each of the variables we include in our matching analysis described above. We examine regression models that consider each interaction effect in a separate model. An additional model that includes these interaction effects together in a single model.

We find that age significantly interacts with the commitment contract treatments in predicting response to the outcome of any preventive health visit. Higher age was significantly associated (at the 5% level) with a lower frequency of attending at least one preventive health visit conditional on being randomized to a commitment contract treatment. It could be the case that older subjects have greater cognitive difficulties following through with commitment contracts or that they forget their commitments more easily or that they are more used to not following through with commitments.

Next, we find that household size had a significant interaction effect for the number of preventive visits but not other outcomes. Larger household size was associated with more preventive health visits in response to the commitment contract treatments. This effect is likely driven by unobserved characteristics of subjects that are correlated with household size; perhaps subjects with larger households are more organized or more experienced with balancing priorities and thus are more easily able to follow through with commitments.

Finally, the survey asked participants whether they believed themselves to be impatient. Those who reported that they strongly agreed that they were impatient were coded to a value of 1 and others to a value of 0. Subjects who reported impatience attended at least one preventive health visit more frequently than did subjects who did not report impatience. Self-reported impatience was also associated with a higher number of preventive visits. These results suggest that these subjects who self-reported impatience might be more aware of their self-control problems. If they are more aware, then they may see greater benefits from commitment contracts, because they may also be more aware of what commitments they are able to keep, and so they may be less likely to take-up commitment contracts that they ultimately fail to follow through with.

When we included all three of the above interaction terms in the same regression model, we found that the interaction effects for the impatient variable were no longer statistically significant, though the coefficients on the impatient interaction terms were qualitatively similar to those in the regressions that included one interaction term per model. These results suggest that the interaction effects identified above are not independent; for example, the pathway by which impatience affects the commitment contract treatment effects may be similar to the pathway by which age affects these impacts, perhaps due to individuals' cognitive status or increasing self-awareness as they age. These interaction effects provide suggestive, not conclusive evidence, that commitment contracts may work better for some individuals than others, providing opportunities for further exploration in subsequent studies.



```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
# SUREG
library(texreg)
library(systemfit)
library(car)
library(lmtest)
library(sandwich)
library(stargazer)

df$village2 <- df$vill_id==2
df$village3 <- df$vill_id==3
df$village4 <- df$vill_id==4

df$short_distance <-  df$distance_ehp=="< 1Km"
df$distance_ehp_int <- df$short_distance*df$cc_all
df$hh_size_int <- df$hh_size*df$cc_all
df$impatient_int <- df$impatient*df$cc_all
df$gender_int <- df$gender*df$cc_all
df$age_int <- df$age*df$cc_all

takeup_eq <- takeup ~ cc_all  +gender + impatient+impatient_int+hh_size+hh_size_int+literate+age+age_int+village2 + village3 + village4
anyvisit_eq <- anyvisit ~ cc_all  +gender +  distance_ehp + impatient+impatient_int  + hh_size +hh_size_int+age+age_int+ taking_meds + village2 + village3 + village4
visitfrac_eq <- visitfrac ~ cc_all  +gender + age+age_int +distance_ehp + hh_size + hh_size_int+ taking_meds + hh_income + impatient+impatient_int + village2 + village3 + village4
prehyper_eq <- prehyper_el ~ cc_all + age + gender + prehyper_bl + bp_sys + bp_dia + weight + taking_meds
bp_sys_eq <- bp_sys_el ~ cc_all + age + gender + bp_sys + bp_dia + weight + taking_meds
bp_dia_eq <- bp_sys_el ~ cc_all + age + gender + bp_sys + bp_dia + weight + taking_meds
weight_eq <- weight_el ~ cc_all + age + gender + weight + waist + exrc + smoke + taking_meds
bmi_eq <- bmi_el ~ cc_all + age + gender + weight + height+  waist + exrc +smoke + taking_meds
smoke_eq <- smoke_el ~ cc_all + age + gender  + smoke + alchl
alchl_eq <- alchl_el ~ cc_all + age +gender + hh_size + impatient+impatient_int+ smoke + alchl
exrc_eq <- exrc_el ~ cc_all + age + gender + smoke + alchl +exrc

system_basic <- list(takeup=takeup_eq,anyvisit=anyvisit_eq, visitfrac=visitfrac_eq)
#system_full <- list(takeup_eq,anyvisit_eq, visitfrac_eq,prehyper_eq,bp_sys_eq,bp_dia_eq, weight_eq,bmi_eq,smoke_eq,alchl_eq,exrc_eq)
system_full <- list(takeup=takeup_eq,anyvisit=anyvisit_eq, visitfrac=visitfrac_eq,prehyper=prehyper_eq,bpsys=bp_sys_eq,weight=weight_eq,bmi=bmi_eq, smoke=smoke_eq,alchl=alchl_eq,exrc=exrc_eq)

fit_sur_basic<- systemfit(system_basic,"SUR", data=subset(df, PanelA_Sample == 1))
fit_sur_full<- systemfit(system_full,"SUR", data=subset(df, PanelB_Sample == 1))

```

```{r results = 'asis', message = FALSE, warning = FALSE, echo = FALSE}
texreg(list(fit_sur_basic,fit_sur_full), caption.above=TRUE,caption = "SUR Models with Interaction Effects", fontsize = "small",custom.model.names=c("Limited Outcome Set","Full Outcome Set"),omit.coef="takeup|village|distance|prehyper|bpsys|weight|exrc|bmi|smoke|alchl|Intercept|obs.|meds|procrastinate")
```
\pagebreak

## Conclusion
Overall, our replication exercise suggests that @bai_2020's main findings are robust to the randomization design employed, the choice of statistical model and the definition of outcome variables assumed. We address the concern of imbalance across treatment arms by matching treated individuals with similar control individuals and found the original results to be robust to the randomization design employed. We then scrutinize the choice of solely relying on an ordinary least squares approach in deriving the effects of commitment contracts by adopting the logistic and Poisson model, as well as multiple equation models to account for potential dependencies between the treatment arms. We also conduct a permutation test to ensure that the normality assumption with regards to the outcome of interest (given covariates) is robust. We find that the authors' original results were similar across these models, and that the main narrative of the original paper still holds true. 

However, using a seemingly unrelated regression model with interaction terms of specific covariates and the treatment variable revealed that there is heterogeneity in the effects of commitment contracts on preventive health visits. Specifically, older individuals are less likely to attend conditional on being assigned to a commitment contract treatment, individuals from larger household sizes are more likely to seek preventive health visits, and finally individuals who are proclaim themselves as impatient are more likely to seek preventive health care visits. Furthermore, the interaction effects of the covariates and the treatment are likely to be dependent between the covariates. For example, a younger individual from a large household may benefit more compared to a similar individual from an average household size. In conclusion, our results suggest that a nuanced interpretation of @bai_2020's main results is warranted. Specifically, while commitment contracts may be on average ineffective in augmenting health behaviors, specific subgroups may benefit from commitment contracts.

## References